We thank all of the reviewers for their thoughtful feedback, and will incorporate their suggestions into the next version

Neural Information Processing Systems

We thank R1 for their comments and will emphasize the broader implications of our work on model explainability. R2 asked us to contrast (i) using influence functions to measure the importance of training points with (ii) existing approaches. These papers address a different problem setting from ours, and their methods are correspondingly distinct. Despite these differences, the methods could be complementary, as R2 suggested; we will include this discussion and thank R2 for pointing it out. R3 asked whether our empirical findings hold for non-convex models.








Most Influential Subset Selection: Challenges, Promises, and Beyond

Hu, Yuzheng, Hu, Pingbang, Zhao, Han, Ma, Jiaqi W.

arXiv.org Machine Learning

How can we attribute the behaviors of machine learning models to their training data? While the classic influence function sheds light on the impact of individual samples, it often fails to capture the more complex and pronounced collective influence of a set of samples. To tackle this challenge, we study the Most Influential Subset Selection (MISS) problem, which aims to identify a subset of training samples with the greatest collective influence. We conduct a comprehensive analysis of the prevailing approaches in MISS, elucidating their strengths and weaknesses. Our findings reveal that influence-based greedy heuristics, a dominant class of algorithms in MISS, can provably fail even in linear regression. We delineate the failure modes, including the errors of the influence function and the non-additive structure of the collective influence. Conversely, we demonstrate that an adaptive version of these heuristics, which applies them iteratively, can effectively capture the interactions among samples and thus partially address the issues. Experiments on real-world datasets corroborate these theoretical findings and further demonstrate that the merit of adaptivity can extend to more complex scenarios such as classification tasks and non-linear neural networks. We conclude our analysis by emphasizing the inherent trade-off between performance and computational efficiency, questioning the use of additive metrics such as the Linear Datamodeling Score, and offering a range of discussions.
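The static-vs-adaptive distinction in the abstract can be illustrated with a minimal sketch for ridge-regularized linear regression. This is not the authors' implementation: the test point `x_test`, the regularizer `lam`, and the choice to target a single test prediction are illustrative assumptions. The static variant ranks all points by influence once and takes the top k; the adaptive variant removes one point, refits, re-scores, and repeats.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, k = 100, 3, 5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)
x_test = rng.normal(size=d)  # illustrative target: the prediction x_test . theta

def fit(X, y, lam=1e-3):
    # Ridge least-squares solution (lam is an assumed small regularizer)
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def influences(X, y, lam=1e-3):
    # First-order influence of removing each point on the test prediction:
    # removing point i shifts theta by roughly H^{-1} g_i, so the prediction
    # shifts by x_test^T H^{-1} g_i, where g_i = x_i (x_i . theta - y_i).
    theta = fit(X, y, lam)
    H = X.T @ X + lam * np.eye(X.shape[1])
    grads = X * (X @ theta - y)[:, None]       # per-sample gradients, shape (n, d)
    return grads @ np.linalg.solve(H, x_test)  # shape (n,)

# Non-adaptive greedy: score once, take the k largest-magnitude influences.
static_pick = np.argsort(-np.abs(influences(X, y)))[:k]

# Adaptive greedy: remove one point, refit on the remainder, re-score, repeat.
remaining = list(range(n))
adaptive_pick = []
for _ in range(k):
    scores = influences(X[remaining], y[remaining])
    j = int(np.argmax(np.abs(scores)))
    adaptive_pick.append(remaining.pop(j))
```

The adaptive loop is k times more expensive (one refit per selected point), which is the performance/efficiency trade-off the abstract highlights: re-scoring after each removal lets later picks account for interactions with earlier ones.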


On the Accuracy of Influence Functions for Measuring Group Effects

Koh, Pang Wei, Ang, Kai-Siang, Teo, Hubert H. K., Liang, Percy

arXiv.org Machine Learning

Influence functions estimate the effect of removing particular training points on a model without needing to retrain it. They are based on a first-order approximation that is accurate for small changes in the model, and so are commonly used for studying the effect of individual points in large datasets. However, we often want to study the effects of large groups of training points, e.g., to diagnose batch effects or apportion credit between different data sources. Removing such large groups can result in significant changes to the model. Are influence functions still accurate in this setting? In this paper, we find that across many different types of groups and in a range of real-world datasets, the influence of a group correlates surprisingly well with its actual effect, even though the absolute and relative errors can be large. Our theoretical analysis shows that such correlation arises under certain settings but need not hold in general, indicating that real-world datasets have particular properties that keep the influence approximation well-behaved.
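The comparison at the heart of this abstract, predicted group influence versus the actual effect of retraining without the group, can be sketched for ridge regression, where both quantities are cheap to compute exactly. This is a toy illustration under assumed synthetic data, not the paper's experimental setup; the group influence is the sum of per-point first-order influences, `H^{-1} sum_i g_i`.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 200, 5, 1e-3
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

def fit(X, y):
    # Ridge least-squares solution (lam is an assumed small regularizer)
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

theta = fit(X, y)
H = X.T @ X + lam * np.eye(d)  # Hessian of the ridge objective (up to a constant factor)

def group_influence(idx):
    # First-order estimate of the parameter change from removing the points in idx:
    # delta_theta ~ H^{-1} sum_{i in idx} g_i, with g_i = x_i (x_i . theta - y_i)
    grads = X[idx] * (X[idx] @ theta - y[idx])[:, None]
    return np.linalg.solve(H, grads.sum(axis=0))

# Remove a sizable group (20% of the data) and compare against actual retraining.
idx = rng.choice(n, size=40, replace=False)
mask = np.ones(n, dtype=bool)
mask[idx] = False
actual = fit(X[mask], y[mask]) - theta      # exact effect of leaving the group out
predicted = group_influence(idx)            # first-order influence estimate
corr = np.corrcoef(predicted, actual)[0, 1]
```

Even when the group is large enough that the first-order approximation is no longer tight in magnitude, `predicted` and `actual` tend to point in the same direction, which mirrors the paper's observation that correlation survives even when absolute error grows.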